[SPARK-23122][PYSPARK][FOLLOW-UP] Update the docs for UDF Registration #20348
LGTM except for one comment.
python/pyspark/sql/udf.py
Outdated
@@ -200,7 +200,7 @@ def __init__(self, sparkSession):
@since("1.3.1")
def register(self, name, f, returnType=None):
"""Registers a Python function (including lambda function) or a user-defined function
`Register` instead of `Registers`, to be consistent with other descriptions?
LGTM. @gatorsmile, sorry I don't know why I missed your comment in #20288.
python/pyspark/sql/udf.py
Outdated
@@ -213,6 +213,10 @@ def register(self, name, f, returnType=None):
`returnType` can be optionally specified when `f` is a Python function but not
when `f` is a user-defined function. Please see below.

To register a non-deterministic Python function, users need to first build
a nondeterministic user-defined function for the Python function and then register it
`nondeterministic` -> `non-deterministic`, or the opposite.
`nondeterministic` is better.
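For reference, the two-step flow described in the docstring above could look roughly like the sketch below. The names `random_int` and `register_random_udf` are made up for illustration and are not part of this PR; `spark` is assumed to be an existing `SparkSession`, and `asNondeterministic()` is assumed to be available (Spark 2.3+).

```python
import random


def random_int():
    """A plain Python function whose result changes between calls."""
    return random.randint(0, 100)


def register_random_udf(spark, name="random_udf"):
    """Register `random_int` as a nondeterministic UDF (illustrative helper)."""
    # Imported lazily so this module can be loaded without PySpark installed.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import IntegerType

    # Step 1: build a user-defined function and mark it nondeterministic.
    nondeterministic_udf = udf(random_int, IntegerType()).asNondeterministic()

    # Step 2: register the UDF itself. No returnType is passed here, because
    # it cannot be specified when `f` is already a user-defined function.
    return spark.udf.register(name, nondeterministic_udf)
```

With a live session, calling `register_random_udf(spark)` should make `SELECT random_udf()` usable from SQL.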
python/pyspark/sql/udf.py
Outdated
@@ -213,6 +213,10 @@ def register(self, name, f, returnType=None):
`returnType` can be optionally specified when `f` is a Python function but not
when `f` is a user-defined function. Please see below.

To register a non-deterministic Python function, users need to first build
Shall we switch this paragraph with `returnType` can be optionally s ...? I intended to explain `returnType` with case 1. and case 2. together.
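To make the two `returnType` cases concrete, here is a minimal sketch, assuming `spark` is an existing `SparkSession`. The helper `register_both_cases`, the function `string_length`, and the registered names are hypothetical and only serve the illustration.

```python
def string_length(s):
    """Plain Python function used in both registration cases."""
    return len(s)


def register_both_cases(spark):
    """Show how returnType is handled in each case (illustrative helper)."""
    # Imported lazily so this module can be loaded without PySpark installed.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import IntegerType

    # Case 1: `f` is a Python function, so returnType may be given to register().
    from_function = spark.udf.register(
        "strlen_from_function", string_length, IntegerType()
    )

    # Case 2: `f` is already a user-defined function. Its return type was fixed
    # when the UDF was built, so register() is called without returnType.
    length_udf = udf(string_length, IntegerType())
    from_udf = spark.udf.register("strlen_from_udf", length_udf)

    return from_function, from_udf
```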
Test build #86458 has finished for PR 20348 at commit
Test build #86461 has finished for PR 20348 at commit
## What changes were proposed in this pull request?
This PR is to update the docs for UDF registration

## How was this patch tested?
N/A

Author: gatorsmile <[email protected]>

Closes #20348 from gatorsmile/testUpdateDoc.

(cherry picked from commit 7328116)
Signed-off-by: gatorsmile <[email protected]>
@HyukjinKwon That is fine. I am reviewing all the API changes made in Spark 2.3 release. Thanks! Merged to master/2.3.
What changes were proposed in this pull request?
This PR updates the docs for UDF registration.
How was this patch tested?
N/A